Abstract

Viruses in the genus Coronavirus are currently placed in three groups based on antigenic cross-reactivity and
sequence analysis of structural protein genes. Consensus polymerase chain reaction (PCR) primers were used to
obtain cDNA, and a highly conserved 922 nucleotide region in open reading frame (ORF) 1b of the polymerase (pol)
gene was then cloned and sequenced from eight coronaviruses. These sequences were compared with published
sequences for three additional coronaviruses. In this comparison, nucleotide substitution frequencies (per 100
nucleotides) varied from 46.40 to 50.13 when viruses were compared among the traditional coronavirus groups and,
with one exception (the human coronavirus (HCV) 229E), varied from 2.54 to 15.89 when viruses were compared
within these groups. (The substitution frequency for 229E, as compared to other members of the same group, varied
from 35.37 to 35.72.) Phylogenetic analysis of these pol gene sequences resulted in groupings which correspond
closely with the previously described groupings, including recent data which place the two avian coronaviruses,
infectious bronchitis virus (IBV) of chickens and turkey coronavirus (TCV), in the same group [Guy, J.S., Barnes,
H.J., Smith, L.G., Breslin, J., 1997. Avian Dis. 41:583-590]. A single pair of degenerate primers was identified
which amplifies a 251 bp region from coronaviruses of all three groups using the same reaction conditions. This
consensus PCR assay for the genus Coronavirus may be useful in identifying as yet unknown coronaviruses.
© 1999 Elsevier Science B.V. All rights reserved.
The genus Coronavirus is in the family Coronaviridae
in the order Nidovirales (Cavanagh,
1997). Viruses of this order have linear, non-seg- 
mented, positive-sense, single-stranded RNA 
genomes with similar genomic organization and a 
nested set of subgenomic mRNAs. Members of 
the genus Coronavirus infect birds and mammals, 
causing respiratory, enteric, cardiovascular and 
neurological disease (Holmes and Lai, 1996). The 
coronaviruses were originally divided into three 
groups based on antigenic relatedness of the struc- 
tural proteins (Sturman and Holmes, 1983; Sid- 
dell, 1995), which include the 
haemagglutinin-esterase (HE), spike (S), integral 
membrane (M) and nucleocapsid (N) proteins. 
Genes encoding these proteins are clustered at the 
3’ end of the 27-31 kb coronavirus genome. 
However, the most highly conserved genomic se- 
quences are found in the 20 kb polymerase (pol) 
gene, which covers the 5’ two-thirds of the coro- 
navirus genome (Snijder and Spaan, 1995). The 
pol gene contains two large open reading frames
(ORFs), ORF 1a and ORF 1b. Within ORF 1b,
there are very highly conserved regions encoding 
conserved functions (e.g. polymerase and helicase 
activity) which, combined with similarities in 
replication and expression strategies, demonstrate 
an evolutionary link among coronaviruses, ar- 
teriviruses, and toroviruses. These similarities 
form the rationale for placing these viruses in the 
order Nidovirales (Snijder et al., 1990, 1991; den
Boon et al., 1991; Godeny et al., 1993; Snijder 
and Spaan, 1995; Cavanagh, 1997). Therefore, the
highly conserved structure and function of viral
polymerases make the pol gene a logical region for
making phylogenetic comparisons, as well as for
developing a consensus polymerase chain reaction
(PCR) assay which could be used for the
identification of novel coronaviruses. This strategy has
been used with other viruses, particularly papillo- 
maviruses (Bernard et al., 1994; Astori et al., 
1997). Such an assay would be useful because 
possible novel coronaviruses have been tentatively 
identified (e.g. using electron microscopy) in asso- 


Table 1
Amino acid (upper right; italics in the original) and nucleotide (lower left) substitution rates (per 100 residues)
in a highly conserved region of open reading frame (ORF) 1b of the pol gene of 11 coronaviruses

      HEV    BCV    OC43   MHV    SDAV   IBV    TCV    FIPV   TGEV   CCV    229E
HEV   -      0.98   2.99   8.27   7.19   38.38  40.56  44.90  45.48  44.32  48.45
BCV   2.54   -      1.98   8.27   7.19   38.38  40.56  43.75  44.32  43.18  48.45
OC43  3.21   3.33   -      9.01   7.91   38.38  40.56  44.90  44.90  44.32  48.45
MHV   15.89  15.09  15.35  -      0.98   38.92  41.11  47.85  47.85  47.85  50.28
SDAV  14.16  13.51  14.03  4.12   -      37.84  40.01  46.65  46.65  46.65  49.05
IBV   49.50  49.08  48.25  49.50  47.01  -      3.00   45.68  46.27  45.68  47.46
TCV   49.92  49.92  48.66  50.13  48.04  7.19   -      47.46  48.06  47.46  49.28
FIPV  50.11  49.06  49.90  49.69  47.82  46.40  48.25  -      0.33   0.33   24.03
TGEV  49.06  48.85  48.65  49.27  48.03  47.63  48.87  3.33   -      0.65   24.48
CCV   48.23  47.82  48.03  49.27  47.00  47.22  49.92  4.01   4.58   -      24.48
229E  49.48  48.44  48.44  50.54  49.69  48.87  48.04  35.37  35.72  35.54  -


Fig. 1. Deduced amino acid sequence of the polymerase motif region from open reading frame (ORF) 1b of the pol gene of 11
coronaviruses. The mouse hepatitis virus (MHV), infectious bronchitis virus (IBV) and 229E sequences are derived from published
sequences (see text). The first amino acid in this figure corresponds to amino acid 466 of the IBV ORF 1b (Lee et al., 1991). Amino
acids 79 through 307 correspond to 228 of the 258 amino acids representing the conserved polymerase motif common to
coronaviruses, toroviruses and arteriviruses (see Fig. 5 in den Boon et al., 1991). The highly conserved SDD or GDD polymerase
motif (Poch et al., 1989) is identified by asterisks. Capitalized letters indicate amino acids which are conserved in all 11 sequences.
[Fig. 1 alignment: amino acid positions 1 to 307 of the 11 sequences, shown in the original in 20-residue blocks with a
consensus line beneath each block. The aligned sequences did not survive text extraction and are not reproduced here.]
Fig. 2. Unrooted dendrogram showing Kimura’s distances
(represented by branch lengths) for cDNA sequences from a 922
nucleotide region of open reading frame (ORF) 1b of the pol
gene of 11 coronaviruses (see text for details). Numbers
represent the results of a bootstrap analysis and indicate the
number of times out of 100 iterations that these branch points
were identified. The eight coronavirus sequences reported here
are available from GenBank under the following accession
numbers: bovine coronavirus (BCV), AF124985; canine
coronavirus (CCV), AF124986; feline infectious peritonitis
virus (FIPV), AF124987; hemagglutinating encephalomyelitis
virus of swine (HEV), AF124988; OC43, AF124989;
sialodacryoadenitis virus of rats (SDAV), AF124990; turkey
coronavirus (TCV), AF124991; transmissible gastroenteritis
virus (TGEV), AF124992.


ciation with a variety of human and animal dis- 
eases, but further characterization and definitive 
identification of these agents as coronaviruses has 
been difficult (Resta et al., 1985; Myint, 1995; 
Guy et al., 1997). 

For these reasons, a highly conserved 922 nucleotide
region in ORF 1b of the pol gene of eight
coronaviruses was cloned and sequenced
using consensus PCR primers. This region has
previously been completely sequenced for two
group 1 viruses, human coronavirus (HCV)-229E
(Herold et al., 1993) and transmissible gastroen- 


Fig. 3. cDNA sequences for a subregion of the 922 nucleotides
from open reading frame (ORF) 1b of the pol gene used for
the analysis shown in Fig. 2. Nucleotide number 1 of this 922
nucleotide-long region corresponds to nucleotide number
13853 in the infectious bronchitis virus (IBV) pol sequence
(Boursnell et al., 1987). This figure shows nucleotides
101 through 400. The regions targeted by the two degenerate
primers (CV2Bp and CV4Bm, see text for sequence) used in
the consensus polymerase chain reaction (PCR) assay for the
genus Coronavirus are underlined.
[Fig. 3 alignment: nucleotides 101 to 400 of the 11 cDNA sequences, with a consensus line and with the CV2Bp and
CV4Bm primer target regions underlined in the original. The aligned sequences did not survive text extraction and
are not reproduced here.]
[Fig. 4 gel image not reproduced. Lane labels as printed: 123 bp, OC43, BCV, MHV, SDAV, 229E, FIPV, TGEV, CCV,
123 bp; 123 bp, IBV, TCV, RT neg, pOC43, PCR neg, 123 bp.]

Fig. 4. Polymerase chain reaction (PCR) products for ten
coronaviruses [OC43, bovine coronavirus (BCV), mouse
hepatitis virus (MHV), sialodacryoadenitis virus of rats (SDAV),
229E, feline infectious peritonitis virus (FIPV), transmissible
gastroenteritis virus (TGEV), canine coronavirus (CCV),
infectious bronchitis virus (IBV), turkey coronavirus (TCV)] using
the consensus PCR primers (2Bp and 4Bm, see text for
sequence) for the genus Coronavirus. Twenty µl of reaction
product were run on a 4% agarose gel (NuSieve 3:1, FMC
BioProducts, Rockland, ME) and stained with 1 µg/ml
ethidium bromide. Also included on the gel were: reaction product
from PCR using 1 pg of plasmid containing target sequence
from human coronavirus (HCV)-OC43 as positive control
(pOC43); reaction products from negative control samples
(water only) which were carried through both the reverse
transcriptase (RT) and PCR steps (RT neg) or the PCR step
alone (PCR neg); 1 µg of 123 bp molecular size standards
(Bethesda Research Labs, Bethesda, MD).


teritis virus (TGEV) of swine (Eleouet et al.,
1995), two different isolates of a single group 2 
virus, mouse hepatitis virus (MHV) (Pachuk et 
al., 1989; Lee et al., 1991), and the single group 3 
virus, infectious bronchitis virus (IBV) of chickens 
(Boursnell et al., 1987). Degenerate oligonucle- 
otide primers were selected by identifying the 


most conserved regions from the published IBV and MHV pol sequences
(Boursnell et al., 1987; Lee et al., 1991). These primers were used to
derive clones from three group 1 viruses: feline infectious peritonitis
virus (FIPV; UCD2 strain provided by Nils Pedersen, University of
California, Davis), TGEV of swine (provided by David Brian, University
of Tennessee, Knoxville) and canine coronavirus (CCV; 1-71 strain from
the American Type Culture Collection (ATCC), catalog no. VR-809,
Rockville, MD), and from five group 2 viruses: hemagglutinating
encephalomyelitis virus of swine (HEV; ATCC catalog no. VR-741),
bovine coronavirus (BCV; provided by David Brian), HCV-OC43 (provided
by Ortwin Schmidt, University of Oklahoma School of Osteopathic
Medicine, Tulsa), sialodacryoadenitis virus of rats (SDAV; provided by
Trenton Schoeb, University of Florida, Gainesville, from a stock
originally derived from ATCC) and turkey enteric coronavirus (TCV;
obtained directly from ATCC, catalog no. VR-911). Two genome-sense
primers were used in the PCR. The 5’-most primer was 8p,
5’-TATGA(GA)GG(TC)GG(GC)TGTATACC-3’, the 5’ end of which was 52
nucleotides upstream from the second genome-sense primer, 1Ap,
5’-GATAAGAGTGC(TA)GGCTA(TC)CC-3’. One antigenome-sense primer, 7m,
5’-ACTAGCATTGT(AG)TGTTG(AT)GAACA-3’, was used for first-strand cDNA
synthesis and for the subsequent PCR. The region amplified by the
1Ap/7m primers (including the primer sequences) corresponds to
nucleotides 13833 through 14797 of IBV (Boursnell et al., 1987) and
15118 through 16082 of MHV (Lee et al., 1991). The 1Ap/7m primer
combination, which produced a 965 bp product, was used for all of the
indicated viruses except HEV and TCV. For these two viruses, the 8p/7m
primer combination was used, which produced a 1013 bp product. The 922
nucleotides internal to the 1Ap/7m primers (919 in the case of IBV and
TCV, which contain a three nucleotide deletion) were sequenced and
analysed. These primers were also used in an attempt to characterize
the putative rabbit coronavirus (RbCV), which was first described in
rabbits with pleural effusion disease and has tentatively been
considered a coronavirus (Small et al., 1979). However, this
classification is not definitive (Siddell, 1995) and the virus is
poorly characterized. The primer pairs (1Ap/7m, 8p/7m) did not amplify
any identifiable sequences from a standard infectious serum (ATCC
VR-920) derived from a rabbit with pleural effusion disease.
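
The primer and product sizes quoted above can be cross-checked with simple arithmetic. The following Python sketch is illustrative only and is not part of the original methods; the helper name `primer_length` and the variable names are hypothetical. It assumes that each parenthesized group, such as (GA), denotes a single degenerate position.

```python
import re

# Primer sequences as quoted in the text; a parenthesized group such as (GA)
# is assumed to denote one degenerate position.
PRIMERS = {
    "8p": "TATGA(GA)GG(TC)GG(GC)TGTATACC",
    "1Ap": "GATAAGAGTGC(TA)GGCTA(TC)CC",
    "7m": "ACTAGCATTGT(AG)TGTTG(AT)GAACA",
}


def primer_length(seq: str) -> int:
    """Length in nucleotides, counting each (..) group as one position."""
    return len(re.sub(r"\([A-Z]+\)", "N", seq))


lengths = {name: primer_length(seq) for name, seq in PRIMERS.items()}

# MHV coordinates of the 1Ap/7m product quoted in the text (inclusive range).
mhv_span = 16082 - 15118 + 1

# Region internal to the primers: product size minus both primer lengths.
internal = 965 - lengths["1Ap"] - lengths["7m"]

# First internal nucleotide in IBV numbering: start of 1Ap plus its length.
first_internal_ibv = 13833 + lengths["1Ap"]

print(lengths)             # {'8p': 20, '1Ap': 20, '7m': 23}
print(mhv_span, internal)  # 965 922
print(first_internal_ibv)  # 13853
```

Under these assumptions the 965 bp product minus the 20 and 23 nucleotide primers leaves the 922 internal nucleotides, and the first internal position in IBV numbering is 13853, the value given in the legend of Fig. 3.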

Viral RNA was prepared (Chomczynski and Sacchi, 1987) from tissue
culture supernatants or cellular extracts. First strand cDNA was
synthesized using avian myeloblastosis virus reverse transcriptase (RT,
Promega, Madison, WI) or Moloney murine leukemia virus RT (Superscript
I or II, Bethesda Research Laboratories, Bethesda, MD). PCR was
performed with 0.25 µM primers, from 0.025 to 0.04 U/µl Taq polymerase
(Promega), manufacturer’s buffer containing 1.5 mM Mg2+, and
deoxynucleotide triphosphates (0.1 mM each). PCR profiles involved an
initial denaturation for 1 min at 98°C followed by 32-40 cycles of
annealing at 45°C for 1 to 2 min, extension at 72°C for 1 min, and
melting at 94°C for 1 min. In some cases, the final 20 cycles were
performed using a 50°C annealing temperature. Amplification products
were subcloned into the pCR1000 or pCR2000 vector using the TA cloning
system (Invitrogen, San Diego, CA). Inserts were sequenced completely
in both directions with Sequenase 2.0 (US Biochemical, Cleveland, OH),
plasmid region primers, the PCR primers, and additional sequencing
primers (not shown). Sequence alignment was performed using the Lineup
and Pileup programs from the Genetics Computer Group software
(Devereux et al., 1984).
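
As a reading aid, the cycling profile described above can be written down as a small data structure. This is a sketch only: the `Step` class and `hold_minutes` function are hypothetical names, and the totals count programmed hold times only, ignoring ramping between temperatures and any final hold, neither of which is described in the text.

```python
from dataclasses import dataclass


@dataclass
class Step:
    name: str
    temp_c: float
    minutes: float


def hold_minutes(initial, cycle, n_cycles):
    """Total programmed hold time: initial steps plus n_cycles repeats of the cycle."""
    return sum(s.minutes for s in initial) + n_cycles * sum(s.minutes for s in cycle)


initial = [Step("initial denaturation", 98, 1.0)]

# Shortest variant in the text: 1 min annealing, 32 cycles.
short_cycle = [
    Step("annealing", 45, 1.0),
    Step("extension", 72, 1.0),
    Step("melting", 94, 1.0),
]
# Longest variant in the text: 2 min annealing, 40 cycles.
long_cycle = [
    Step("annealing", 45, 2.0),
    Step("extension", 72, 1.0),
    Step("melting", 94, 1.0),
]

shortest = hold_minutes(initial, short_cycle, 32)
longest = hold_minutes(initial, long_cycle, 40)
print(shortest, longest)  # 97.0 161.0
```

On these assumptions the programmed hold time ranges from 97 to 161 min, depending on the annealing time and the number of cycles chosen.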

The deduced amino acid sequences for this region of ORF 1b of the pol
gene for the 11 coronaviruses align precisely (Fig. 1) and correspond
to the highly conserved region surrounding the SDD or GDD polymerase
motif common to viral RNA-dependent polymerases (Poch et al., 1989).
The only gaps in the alignment are attributable to a single amino acid
deletion at position 16 in both IBV and TCV. All coronaviruses show the
SDD sequence at the putative active site of the polymerase, except TCV,
which, unusually for a viral RNA-dependent RNA polymerase, has an ADD
sequence. The amino acid and nucleotide substitution frequencies among
these 11 viruses are shown in Table 1 and reveal relationships which
are similar to the patterns described by the three groups, with the
single exception that the TCV sequence is much more similar to IBV than
to any other coronavirus. For example, within the group 2 cluster of
five viruses the maximum substitution frequency is 16/100 nucleotides
(comparing MHV to HEV), while among three of the four group 1 viruses
(FIPV, TGEV and CCV) the frequency is < 5/100 nucleotides. However,
229E differs from these three by an average of 36 substitutions/100
nucleotides, which is consistent with the weaker antigenic relationship
of 229E to these viruses (Sanchez et al., 1990). The TCV sequence is
very similar to IBV, showing a substitution frequency of only 7.2/100
nucleotides, clearly suggesting that these viruses should fall within
the same group.
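
To make the unit "substitutions per 100 nucleotides" concrete, the following Python sketch counts mismatches between two aligned sequences. It is an assumption-laden illustration: the text does not state which program or correction produced the values in Table 1, so this uncorrected count need not reproduce them, and the function name and the two short sequences are invented for the example.

```python
def substitutions_per_100(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Uncorrected mismatches per 100 aligned positions, skipping gapped columns."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue  # a deletion in either sequence is not scored here
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * differences / compared


# Hypothetical 10-nucleotide fragments that differ at three positions.
print(substitutions_per_100("ATGACTGGCC", "ATGACTAATC"))  # 30.0
```

An uncorrected count of this kind understates divergence between distant sequences because repeated changes at the same site are counted once, which is one reason corrected distances are commonly used for between-group comparisons.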

To further characterize the phylogenetic relationships among these
viruses, a dendrogram was created with PAUP (version 3.0) using the
maximum parsimony method. A branch and bound algorithm was used to
identify the single most parsimonious tree. Only one tree was
identified. The three nucleotides missing in the IBV and TCV sequences
(which represent a single amino acid deletion) were each treated as a
separate character state rather than as missing data. The resulting
unrooted tree is shown in Fig. 2. The consistency index of the tree was
0.818 and the rescaled consistency index was 0.711. Bootstrap analysis
was also performed and the resulting values are shown at branch points
in the figure. An identical tree and essentially identical bootstrap
values were also derived using the Kimura two-parameter method for
calculating distances and the neighbor-joining method to construct the
tree (using the Clustal W program). Again, this analysis reveals that
the published IBV sequence and the TCV sequence presented here are very
closely related. In addition, the three group 1 viruses FIPV, TGEV and
CCV are found on a common branch, with HCV-229E being more distantly
related. The group 2 viruses fall into two groupings, with SDAV and MHV
being closely related to one another and the remaining three viruses in
this group (HCV-OC43, BCV and HEV) forming a separate branch.
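
For readers unfamiliar with the Kimura two-parameter method mentioned above, the standard distance is d = -(1/2) ln(1 - 2P - Q) - (1/4) ln(1 - 2Q), where P and Q are the proportions of sites showing transitions and transversions. The sketch below is a minimal Python illustration of that formula, not the Clustal W implementation used in the study; the function name and the test sequences are hypothetical.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def kimura_2p(seq_a: str, seq_b: str) -> float:
    """Kimura two-parameter distance between two aligned DNA sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    sites = transitions = transversions = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguous bases
        sites += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1    # A<->G or C<->T
        else:
            transversions += 1  # purine <-> pyrimidine
    if sites == 0:
        raise ValueError("no comparable positions")
    p = transitions / sites
    q = transversions / sites
    return -0.5 * math.log(1 - 2 * p - q) - 0.25 * math.log(1 - 2 * q)


# Hypothetical example: one transition (A to G) in ten sites.
d = kimura_2p("ACGTACGTAC", "GCGTACGTAC")
print(round(d, 4))  # 0.1116
```

In this toy case the corrected distance (about 0.112 substitutions per site) is slightly larger than the observed proportion of differences (0.1), reflecting the correction for multiple changes at the same site.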
This phylogenetic analysis conforms closely to results from antigenic
studies of these coronaviruses, with the single exception that the
analysis indicates that TCV and IBV are closely related viruses.
Coronaviruses have traditionally been divided into three groups
(Sturman and Holmes, 1983; Siddell, 1995), including two groups of
primarily mammalian coronaviruses (groups 1 and 2, although TCV was
recently included in group 2, Siddell, 1995) and a separate,
single-member group for the avian coronavirus IBV (group 3). The
antigenic characterization of the second avian coronavirus, TCV, has
been controversial. Serologic studies (Dea and Tijssen, 1989; Dea et
al., 1990) and sequence analysis of the N and M genes (Verbeek and
Tijssen, 1991) from a cell culture-adapted clone of the Minnesota
strain of TCV indicate that TCV is closely related to the group 2
mammalian coronaviruses, particularly BCV and HCV-OC43. However, recent
serologic studies with both polyclonal and monoclonal antibodies (Guy
et al., 1997) indicate that the Minnesota strain of TCV, as well as
additional field isolates of TCV, are close antigenic relatives of IBV.
The data presented here agree with this latter conclusion. Since the
pol gene product is not involved in the determination of antigenic
cross-reactivity among viruses, the data do not directly address the
discrepancy between the results of Guy et al. (1997) and Dea et al.
(1990), but do indicate that further work is necessary to resolve the
contradictory findings with regard to the characterization of TCV.

A goal of the sequence analysis described above was to identify
conserved regions which could be targeted for the development of a
consensus PCR assay for the genus Coronavirus. Since neither primer
pair used in cloning these pol gene regions (1Ap/7m or 8p/7m) detected
all 11 coronaviruses used in this study, the 922 (919 in the case of
IBV and TCV) nucleotide region internal to the 1Ap/7m primers was
compared among the viruses to identify regions with greater sequence
identity. As shown in Fig. 3, two regions were selected to serve as
targets for two degenerate oligonucleotide primers: primer 2Bp,
5’-ACTCA(A/G)(A/T)T(A/G)AAT(T/C)TNAAATA(T/C)GC-3’; and primer 4Bm,
5’-TCACA(C/T)TT(A/T)GGATA(G/A)TCCCA-3’. After testing different
reaction conditions, a protocol was selected in which the RT and PCR
portions of the assay were performed essentially as described above,
using the 4Bm oligonucleotide to prime cDNA synthesis. Annealing
conditions during the PCR assay were also modified slightly from those
described above, namely: in the first five cycles the annealing
temperature was 40°C (2 min), followed by 35 cycles at 50°C (1.5 min).
The sensitivity of this protocol was tested using a plasmid containing
the 965 bp HCV-OC43 pol sequence. The limit of detection for this
plasmid on an ethidium bromide-stained gel was 6000 plasmid copies
(data not shown). The assay was then tested on representative
coronaviruses from each group. As shown in Fig. 4, these primers
amplified the expected 251 bp region in four group 1 viruses (229E,
FIPV, TGEV, CCV), four group 2 viruses (OC43, BCV, MHV, SDAV), the
single, currently recognized, group 3 virus (IBV), and TCV, which is
currently placed in group 2. In addition, these primers detected a
fifth group 2 virus, HEV (data not shown). After repeated attempts,
these primers did not detect the pol target sequence in infectious
serum from a rabbit with pleural effusion disease (containing
4x 10° rabbit infectious units; ATCC VR-920). Thus this assay will
detect all ten of the well-characterized coronaviruses studied here,
will also detect TCV, but will not detect the putative RbCV. This
result suggests that the putative RbCV is not a member of the genus
Coronavirus. However, slight variations in the target sequences for
these primers, or a lack of sensitivity of this assay, could also
explain this negative result.
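
The degenerate primer notation used above can be unpacked programmatically. The following Python sketch is illustrative and rests on stated assumptions: a group such as (A/G) is read as one position that may be either base, N is read as any of the four bases, and the function names are hypothetical. It reports the length and degeneracy (the number of distinct oligonucleotides in the mixture) of 2Bp and 4Bm, and writes the antigenome-sense primer 4Bm as its genome-sense reverse complement.

```python
import re
from math import prod

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

PRIMER_2BP = "ACTCA(A/G)(A/T)T(A/G)AAT(T/C)TNAAATA(T/C)GC"
PRIMER_4BM = "TCACA(C/T)TT(A/T)GGATA(G/A)TCCCA"


def parse_primer(seq):
    """Return one set of allowed bases per primer position."""
    positions = []
    for group, single in re.findall(r"\(([ACGT/]+)\)|([ACGTN])", seq):
        if group:
            positions.append(set(group.replace("/", "")))
        elif single == "N":
            positions.append(set("ACGT"))
        else:
            positions.append({single})
    return positions


def degeneracy(positions):
    """Number of distinct sequences represented by the degenerate primer."""
    return prod(len(p) for p in positions)


def reverse_complement(positions):
    return [{COMPLEMENT[b] for b in p} for p in reversed(positions)]


def format_positions(positions):
    """Write base sets back in the (A/G) notation, using N for all four bases."""
    out = []
    for p in positions:
        if len(p) == 1:
            out.append(next(iter(p)))
        elif len(p) == 4:
            out.append("N")
        else:
            out.append("(" + "/".join(sorted(p)) + ")")
    return "".join(out)


p2 = parse_primer(PRIMER_2BP)
p4 = parse_primer(PRIMER_4BM)
print(len(p2), degeneracy(p2))  # 23 128
print(len(p4), degeneracy(p4))  # 20 8
print(format_positions(reverse_complement(p4)))
# TGGGA(C/T)TATCC(A/T)AA(A/G)TGTGA
```

Read this way, 2Bp is a 23-mer mixture of 128 sequences and 4Bm is a 20-mer mixture of 8 sequences.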

Coronaviruses infect a variety of animal hosts 
and many uncharacterized coronaviruses have 
been implicated in a variety of diseases, particu- 
larly enteric (Resta et al., 1985; Guy et al., 1997) 
and respiratory (Myint, 1995) infections. The 
consensus PCR approach described here has al- 
ready provided novel information on the identity 
of one little-studied coronavirus (TCV), suggest- 
ing that it should be classified with IBV in group 
3. In the future, this consensus PCR approach 
should prove useful in identifying and character- 
izing additional members of the genus Coro- 
navirus. 